Cleanup normalize_total #1667
Merged
ivirshup force-pushed the fixup-normalize-total branch from 288aa54 to 4578d63 (March 2, 2021 01:51)
ivirshup commented Mar 3, 2021
jlause added a commit to jlause/scanpy that referenced this pull request Mar 5, 2021
jlause added a commit to jlause/scanpy that referenced this pull request Mar 5, 2021
jlause added a commit to jlause/scanpy that referenced this pull request Mar 5, 2021
…rson_residuals() as well
jlause added a commit to jlause/scanpy that referenced this pull request Mar 10, 2021
jlause added a commit to jlause/scanpy that referenced this pull request Mar 10, 2021
…rson_residuals() as well
Zethson pushed a commit that referenced this pull request Mar 15, 2021
* Cleanup normalize_total
* Add modification tests and copy kwarg for normalize_total
* Test that 'layers' argument is deprecated
* Added more mutation checks for normalize_total
* release note
* Error message
ivirshup added a commit that referenced this pull request Mar 18, 2021
* add flake8 pre-commit Signed-off-by: Zethson <[email protected]> * fix pre-commit Signed-off-by: Zethson <[email protected]> * add E402 to flake8 ignore Signed-off-by: Zethson <[email protected]> * revert neighbors Signed-off-by: Zethson <[email protected]> * fix flake8 Signed-off-by: Zethson <[email protected]> * address review Signed-off-by: Zethson <[email protected]> * fix comment character in .flake8 Signed-off-by: Zethson <[email protected]> * fix test Signed-off-by: Zethson <[email protected]> * black Signed-off-by: Zethson <[email protected]> * review round 2 Signed-off-by: Zethson <[email protected]> * review round 3 Signed-off-by: Zethson <[email protected]> * readded double comments Signed-off-by: Zethson <[email protected]> * Ignoring E262 & reverted comment Signed-off-by: Zethson <[email protected]> * using self for obs_tidy Signed-off-by: Zethson <[email protected]> * Restore setup.py * rm call of black test (#1690) * Fix print_versions for python<3.8 (#1691) * add codecov so we can have a badge to point to (#1693) * Attempt server-side search (#1672) * Fix paga_path (#1047) Fix paga_path Co-authored-by: Isaac Virshup <[email protected]> * Switch to flit This reverts commit d645790 * add setup.py while leaving it ignored * Update install instructions * Circumvent new pip check (see pypa/pip#9628) * Go back to regular pip (#1702) * Go back to regular flit Co-authored-by: Isaac Virshup <[email protected]> * codecov comment (#1704) * Use joblib for parallelism in regress_out (#1695) * Use joblib for parallism in regress_out * release note * fix link in release notes * Add todo for resource test * Add sparsificiation step before sparse-dependent Scrublet calls (#1707) * Add sparsificiation step before sparse-dependent Scrublet calls * Apply sparsification suggestion Co-authored-by: Isaac Virshup <[email protected]> * Fix imports Co-authored-by: Isaac Virshup <[email protected]> * Fix version on Travis (#1713) By default, Travis does `git clone 
--depth=50` which means the version can’t be detected from the git tag. * `sc.metrics` module (add confusion matrix & Geary's C methods) (#915) * Add `sc.metrics` with `gearys_c` Add a module for computing useful metrics. Started off with Geary's C since I'm using it and finding it useful. I've also got a fairly fast way to calculate it worked out. Unfortunatly my implementation runs into some issues with some global configs set by umap (see lmcinnes/umap#306), so I'm going to see if that can be resolved before changing it. * Add sc.metrics.confusion_matrix * Better tests and output for confusion_matrix * Workaround umap<0.4 and increase numerical stability of gearys_c * Work around lmcinnes/umap#306 by not calling out to kernel function. That code has been kept, but commented out. * Increase numerical stability by casting data to system width. Tests were failing due to instability. * Split up gearys_c tests * Improved unexpected error message * gearys_c working again. Sadly, a bit slower * One option for doc strings * Simplify implementation to use single dispatch * release notes * Fix clipped images in docs (#1717) * Cleanup normalize_total (#1667) * Cleanup normalize_total * Add modification tests and copy kwarg for normalize_total * Test that 'layers' argument is deprecated * Added more mutation checks for normalize_total * release note * Error message * deprecate scvi (#1703) * deprecate scvi * Update .azure-pipelines.yml Co-authored-by: Isaac Virshup <[email protected]> * remove :func: links to scvi in release notes * remove tildes in front of scvi in release notes * Update docs/release-notes/1.5.0.rst Co-authored-by: Michael Jayasuriya <[email protected]> Co-authored-by: Isaac Virshup <[email protected]> * updated ecosystem.rst to add triku (#1722) * Minor addition to contributing docs (#1726) * Preserve category order when groupby is a list (#1735) Preserve category order when groupby is a list * Asymmetrical diverging colormaps and vcenter (#1551) Add 
vcenter and norm arguments to plotting functions * add flake8 pre-commit Signed-off-by: Zethson <[email protected]> * add E402 to flake8 ignore Signed-off-by: Zethson <[email protected]> * revert neighbors Signed-off-by: Zethson <[email protected]> * address review Signed-off-by: Zethson <[email protected]> * black Signed-off-by: Zethson <[email protected]> * using self for obs_tidy Signed-off-by: Zethson <[email protected]> * rebased Signed-off-by: Zethson <[email protected]> * rebasing Signed-off-by: Zethson <[email protected]> * rebasing Signed-off-by: Zethson <[email protected]> * rebasing Signed-off-by: Zethson <[email protected]> * add flake8 to dev docs Signed-off-by: Zethson <[email protected]> * add autopep8 to pre-commits Signed-off-by: Zethson <[email protected]> * add flake8 ignore docs Signed-off-by: Zethson <[email protected]> * add exception todos Signed-off-by: Zethson <[email protected]> * add ignore directories Signed-off-by: Zethson <[email protected]> * reinstated lambdas Signed-off-by: Zethson <[email protected]> * fix tests Signed-off-by: Zethson <[email protected]> * fix tests Signed-off-by: Zethson <[email protected]> * fix tests Signed-off-by: Zethson <[email protected]> * fix tests Signed-off-by: Zethson <[email protected]> * fix tests Signed-off-by: Zethson <[email protected]> * Add E741 to allowed flake8 violations. 
Co-authored-by: Isaac Virshup <[email protected]> * Add F811 flake8 ignore for tests Co-authored-by: Isaac Virshup <[email protected]> * Fix mask comparison Co-authored-by: Isaac Virshup <[email protected]> * Fix mask comparison Co-authored-by: Isaac Virshup <[email protected]> * fix flake8 config file Signed-off-by: Zethson <[email protected]> * readded autopep8 Signed-off-by: Zethson <[email protected]> * import Literal Signed-off-by: Zethson <[email protected]> * revert literal import Signed-off-by: Zethson <[email protected]> * fix scatterplot pca import Signed-off-by: Zethson <[email protected]> * false comparison & unused vars Signed-off-by: Zethson <[email protected]> * Add cleaner level determination Co-authored-by: Isaac Virshup <[email protected]> * Fix comment formatting Co-authored-by: Isaac Virshup <[email protected]> * Add smoother dev documentation Co-authored-by: Isaac Virshup <[email protected]> * fix flake8 Signed-off-by: Zethson <[email protected]> * Readd long comment Co-authored-by: Isaac Virshup <[email protected]> * Assuming X as array like Co-authored-by: Isaac Virshup <[email protected]> * fix flake8 Signed-off-by: Zethson <[email protected]> * fix flake8 config Signed-off-by: Zethson <[email protected]> * reverted rank_genes Signed-off-by: Zethson <[email protected]> * fix disp_mean_bin formatting Co-authored-by: Isaac Virshup <[email protected]> * fix formatting Signed-off-by: Zethson <[email protected]> * add final todos Signed-off-by: Zethson <[email protected]> * boolean checks with is Signed-off-by: Zethson <[email protected]> * _dpt formatting Signed-off-by: Zethson <[email protected]> * literal fixes Signed-off-by: Zethson <[email protected]> * links to leafs Signed-off-by: Zethson <[email protected]> * revert paga variable naming Co-authored-by: Philipp A <[email protected]> Co-authored-by: Sergei Rybakov <[email protected]> Co-authored-by: Isaac Virshup <[email protected]> Co-authored-by: Jonathan Manning <[email protected]> 
Co-authored-by: mjayasur <[email protected]> Co-authored-by: Michael Jayasuriya <[email protected]> Co-authored-by: Alex M. Ascensión <[email protected]> Co-authored-by: Gökçen Eraslan <[email protected]>
ivirshup added a commit that referenced this pull request Mar 29, 2022
* adding core functions and documentation for pearson residual normalization and hvg selection * adding Pearson residual+PCA bundles, minor bug fixes * some style cleanup, minor fixes * adapting _normalize_pearson_residuals() to cleaned-up _normalized_total() from #1667 * updating layer management as in #1667 for _highly_variable_pearson_residuals() as well * slight performance improvement for sparse input * style cleanup * fixing import issue, fixing docstring style, adding check_values param and warning as in #1642 * fixed small NameError, simplified clip argument * remove pd.categorical() * adding check_values to docstrings and remaining pearson residual functions * np.empty instead of np.nan * add references to docstrings, add HVG details to docstring * exposing pca keyword arguments to the user for the bundle/recipe functions * removed unneeded reversal in hvg, fix kwargs_pca bug, consistent defaults across files * fixing handling of `inplace` and `subset` arguments (see issue #1886), explicit typing of output, adding theta input check * renaming output fields for consistency, fixing minor bug * renaming output fields for consistency * adding function that prepares testdata (used for pearson residual tests) * adding tests for all pearson residual functions * fix precommit high_var_genes * try to get precommit to work * try to get precommit to work * fix recipes * fix normalization * remove relative imports * fix docstrings * retry to build docs * fix highvar docstring * more fixing docstrings * docs build locally ? 
🔨 * minor cleanup test normalization * more minor cleanups * final cleanup normalization * fixes high var * init experimental module * fix column ordering for batch case * moving to experimental, minor fix for experimental version of hvg selection * linking tests to new experimental submodule, style cleanup * adapt input arguments and docstring for experimental version of hvg selection function * add recipes * fix docs * add correct module docs * fix recipe docstrings * try fix indentation * fix indentation * fix * new indentation * add space * fixing typo in docstring * renaming pca output fields * adapting tests to new output fieldname * fix docs 🔨 * update docs * fix test 🔨 * ensure argument and docstring consistency * update citation year * cleaning imports in `preprocessing` functions * making inputcheck tests specific to error/warning messages * making inputcheck tests specific to error/warning messages * resolve HVGs across batches more cleanly, fix dtype issue * renaming pca input arguments * renaming pca input arguments * _pca bundle: more efficient copy handling, added input check. 
both _pca and _recipe: varm field for PCs, adapted tests and docs * move repeated inputcheck code to helpers * merging tests *_values and *_general * condense code in pearson hvg selection test, smaller test data for speedup * condensing code in normalization tests * add asteriks for keyword * updating refs to Genome Biology publication * cleanup helpers.py * cleanup main files as requested by @ivirshup * revert unneeded settingWithCopy fix * cache data * use doc_params for doc * fix doc_params var * finalize docs * fix param doc * wrong var still * add cached datasets module and test on high_var_genes tests * use new cache dataset module for tests * fix precommit * fix docs * fix reference and add notebook to tutorials * add release note * add release note * fix release note * typo * remove duplicate reference * fixing black flake etc requirements * add _pca function to release note * last edits to docs * fix release and tutorial image * try fix pre-commit * minor docs * Remove accidentally included files from merge Co-authored-by: giovp <[email protected]> Co-authored-by: Isaac Virshup <[email protected]>
I was looking over `normalize_total` and saw some strange behaviour. Since it's such a common function, I think it's important that it has standard scanpy behaviour. To this end, this PR cleans up its code.
Addition
`layer` argument. A specific layer can now be normalized by itself.
Deprecations
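To make the change concrete, here is a rough sketch of the pattern this PR points users towards: normalize one named layer at a time, and write an explicit loop when several layers need the same treatment. It is plain Python and not scanpy's implementation. The `normalize_rows` helper, the `layers` dictionary, and the layer names are hypothetical stand-ins. With scanpy itself, each step would presumably be a call along the lines of `sc.pp.normalize_total(adata, layer=name)`.

```python
from typing import Dict, List

Matrix = List[List[float]]  # rows are cells, columns are genes


def normalize_rows(counts: Matrix, target_sum: float) -> Matrix:
    """Scale each row so its entries sum to target_sum.

    Hypothetical helper for illustration only; all-zero rows stay zero.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([value * scale for value in row])
    return normalized


# Hypothetical stand-in for the named layers of an AnnData object.
layers: Dict[str, Matrix] = {
    "spliced": [[1.0, 3.0], [2.0, 2.0]],
    "unspliced": [[0.0, 5.0], [4.0, 1.0]],
}

# Single-layer style: normalize one specific layer by name.
layers["spliced"] = normalize_rows(layers["spliced"], target_sum=10.0)

# Stand-in for a multi-layer call: an explicit loop over layer names.
for name in ["unspliced"]:
    layers[name] = normalize_rows(layers[name], target_sum=10.0)
```

Keeping the loop in user code means each layer can get its own target sum or be skipped entirely, without the function needing extra arguments for those cases.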
I've deprecated the `layers` and `layer_norm` arguments. Normalizing multiple layers at once seems less useful than normalizing a specific layer. These seem like very specific use cases that are easy for users to implement themselves, and are not common patterns in scanpy functions.
TODO: